69 research outputs found

    “Lossless” compression of high resolution mass spectra of small molecules

    Fourier transform ion cyclotron resonance (FTICR) provides the highest resolving power of any commercially available mass spectrometer. This advantage is most significant for species of low mass-to-charge ratio (m/z), such as metabolites. Unfortunately, FTICR spectra contain a very large number of data points, most of which are noise. This is most pronounced at the low m/z end of spectra, where data point density is the highest but peak density is low. We therefore developed a filter that offers lossless compression of FTICR mass spectra from singly charged metabolites. The filter relies on the high resolving power and mass measurement precision of FTICR and removes only those m/z channels that cannot contain signal from singly charged organic species. The resulting pseudospectra still contain the same signal as the original spectra but less uninformative background. The filter does not affect the outcome of standard downstream chemometric analysis methods, such as principal component analysis, but use of the filter significantly reduces memory requirements and CPU time for such analyses. We demonstrate the utility of the filter for urinary metabolite profiling using direct infusion electrospray ionization and a 15 tesla FTICR mass spectrometer.
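The abstract does not say how the filter decides which m/z channels "cannot contain signal", so the following is only a minimal sketch of one plausible reading: keep a channel only if its mass defect (the offset from the nearest integer mass) falls inside a band that widens with mass. The function names and the numeric bounds (`min_slope`, `max_slope`, `tol`) are hypothetical placeholders, not values from the paper, and the simple rounding used here is only sensible at the low m/z end that the abstract focuses on.

```python
def channel_allowed(mz, min_slope=-0.0002, max_slope=0.0012, tol=0.005):
    """Return True if an m/z channel lies inside an assumed mass-defect band.

    The band limits are illustrative assumptions, not published values.
    """
    nominal = round(mz)                # nearest integer mass
    defect = mz - nominal              # mass defect of this channel
    lower = min_slope * nominal - tol  # assumed lower edge of the band
    upper = max_slope * nominal + tol  # assumed upper edge of the band
    return lower <= defect <= upper


def filter_spectrum(mzs, intensities, **band):
    """Keep only (m/z, intensity) pairs whose channel passes the band test."""
    return [(mz, inten) for mz, inten in zip(mzs, intensities)
            if channel_allowed(mz, **band)]


if __name__ == "__main__":
    mzs = [181.0707, 181.45, 182.0741]
    intensities = [1200.0, 15.0, 90.0]
    print(filter_spectrum(mzs, intensities))
```

Under these assumed bounds, the channel at m/z 181.45 is dropped because no plausible singly charged organic ion has so large a mass defect at that mass, while the channels near integer masses are kept with their intensities unchanged.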

    Assembling proteomics data as a prerequisite for the analysis of large scale experiments

    Background: Despite the complete determination of the genome sequences of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present the results of large-scale proteomics experiments. Results: In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as the 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java based interface (Data Transfer Tool) in accordance with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable from SQL-LIMS via XML. Conclusion: The Oracle based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi-software solutions, especially if personnel fluctuations are high. Moreover, large-scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS in order to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full-time administrator and a specialized lab manager. Moreover, the high technical dynamics in proteomics may make it difficult to adapt to new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling, especially in skilled processes such as gel electrophoresis or mass spectrometry, and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of junk data.

    AYUMS: an algorithm for completely automatic quantitation based on LC-MS/MS proteome data and its application to the analysis of signal transduction

    BACKGROUND: Comprehensive description of the behavior of cellular components in a quantitative manner is essential for systematic understanding of biological events. Recent LC-MS/MS (tandem mass spectrometry coupled with liquid chromatography) technology, in combination with the SILAC (Stable Isotope Labeling by Amino acids in Cell culture) method, has enabled us to make relative quantitation at the proteome level. The recent report by Blagoev et al. (Nat. Biotechnol., 22, 1139–1145, 2004) indicated that this method was also applicable for the time-course analysis of cellular signaling events. Relative quantitation can easily be performed by calculating the ratio of peak intensities corresponding to differentially labeled peptides in the MS spectrum. As currently available software requires some GUI applications and is time-consuming, it is not suitable for processing large-scale proteome data. RESULTS: To resolve this difficulty, we developed an algorithm that automatically detects the peaks in each spectrum. Using this algorithm, we developed a software tool named AYUMS that automatically identifies the peaks corresponding to differentially labeled peptides, compares these peaks, calculates each of the peak ratios in mixed samples, and integrates them into one data sheet. This software has enabled us to dramatically save time for generation of the final report. CONCLUSION: AYUMS is a useful software tool for comprehensive quantitation of the proteome data generated by LC-MS/MS analysis. This software was developed using Java and runs on Linux, Windows, and Mac OS X. Please contact [email protected] if you are interested in the application. The project web page is
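The relative quantitation step described in the abstract above (the ratio of peak intensities for differentially labeled peptides) can be sketched as follows. This is not the AYUMS algorithm, whose details the abstract does not give; it is a minimal illustration that assumes a centroided peak list, a known light-peptide m/z and charge, and a single label with a mass shift of 6.0201 Da (the shift for one fully 13C6-labeled residue). The function names and the tolerance are hypothetical.

```python
def strongest_peak_near(peaks, target_mz, tol):
    """Return the most intense (m/z, intensity) peak within tol of target_mz."""
    candidates = [p for p in peaks if abs(p[0] - target_mz) <= tol]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p[1])


def heavy_to_light_ratio(peaks, light_mz, charge, label_shift=6.0201, tol=0.01):
    """Intensity ratio of the heavy-labeled peak to its light partner.

    The heavy partner is expected at light_mz + label_shift / charge.
    Returns None when either peak is missing from the spectrum.
    """
    light = strongest_peak_near(peaks, light_mz, tol)
    heavy = strongest_peak_near(peaks, light_mz + label_shift / charge, tol)
    if light is None or heavy is None:
        return None
    return heavy[1] / light[1]


if __name__ == "__main__":
    # Hypothetical centroided spectrum: (m/z, intensity) pairs.
    spectrum = [(500.25, 1000.0), (503.26, 2000.0), (600.0, 50.0)]
    print(heavy_to_light_ratio(spectrum, light_mz=500.25, charge=2))
```

Dividing the label shift by the charge is what places the heavy partner correctly for multiply charged peptides; a doubly charged pair is separated by about 3.01 m/z units rather than 6.02.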

    LipidXplorer: A Software for Consensual Cross-Platform Lipidomics

    LipidXplorer is open source software that supports the quantitative characterization of complex lipidomes by interpreting large datasets of shotgun mass spectra. LipidXplorer processes spectra acquired on any type of tandem mass spectrometer; it identifies and quantifies molecular species of any ionizable lipid class by considering any known or assumed molecular fragmentation pathway independently of any resource of reference mass spectra. It also supports any shotgun profiling routine, from high throughput top-down screening for molecular diagnostics and biomarker discovery to the targeted absolute quantification of low abundant lipid species. Full documentation on installation and operation of LipidXplorer, including a tutorial, a collection of spectra interpretation scripts, an FAQ and a user forum, is available through the wiki site at: https://wiki.mpi-cbg.de/wiki/lipidx/index.php/Main_Page

    MicroRNA-96 Directly Inhibits γ-Globin Expression in Human Erythropoiesis

    Fetal hemoglobin, HbF (α2γ2), is the main hemoglobin synthesized up to birth, but it subsequently declines and adult hemoglobin, HbA (α2β2), becomes predominant. Several studies have indicated that expression of the HbF subunit γ-globin might be regulated post-transcriptionally. This could be conferred by ∼22-nucleotide long microRNAs that associate with argonaute proteins to specifically target γ-globin mRNAs and inhibit protein expression. Indeed, applying immunopurifications, we found that γ-globin mRNA was associated with argonaute 2 isolated from reticulocytes that contain low levels of HbF (<1%), whereas association was significantly lower in reticulocytes with high levels of HbF (90%). Comparing microRNA expression in reticulocytes from cord blood and adult blood, we identified several miRNAs that were preferentially expressed in adults, among them miRNA-96. The overexpression of microRNA-96 in human ex vivo erythropoiesis decreased γ-globin expression by 50%, whereas the knock-down of endogenous microRNA-96 increased γ-globin expression by 20%. Moreover, luciferase reporter assays showed that microRNA-96 negatively regulates expression of γ-globin in HEK293 cells, which depends on a seedless but highly complementary target site located within the coding sequence of γ-globin. Based on these results we conclude that microRNA-96 directly suppresses γ-globin expression and thus contributes to HbF regulation.

    A Semantic Web Management Model for Integrative Biomedical Informatics

    Data, data everywhere. The diversity and magnitude of the data generated in the Life Sciences defies automated articulation among complementary efforts. The additional need in this field for managing property and access permissions compounds the difficulty very significantly. This is particularly the case when the integration involves multiple domains and disciplines, even more so when it includes clinical and high throughput molecular data. The emergence of Semantic Web technologies brings the promise of meaningful interoperation between data and analysis resources. In this report we identify a core model for biomedical Knowledge Engineering applications and demonstrate how this new technology can be used to weave a management model where multiple intertwined data structures can be hosted and managed by multiple authorities in a distributed management infrastructure. Specifically, the demonstration is performed by linking data sources associated with the Lung Cancer SPORE awarded to The University of Texas MD Anderson Cancer Center at Houston and the Southwestern Medical Center at Dallas. A software prototype, available with open source at www.s3db.org, was developed and its proposed design has been made publicly available as an open source instrument for shared, distributed data management. The Semantic Web technologies have the potential to address the need for distributed and evolvable representations that are critical for systems biology and translational biomedical research. As this technology is incorporated into application development we can expect that both general purpose productivity software and domain specific software installed on our personal computers will become increasingly integrated with the relevant remote resources. In this scenario, the acquisition of a new dataset should automatically trigger the delegation of its analysis

    Extraction of pure components from overlapped signals in gas chromatography-mass spectrometry (GC-MS)

    Gas chromatography-mass spectrometry (GC-MS) is a widely used analytical technique for the identification and quantification of trace chemicals in complex mixtures. When complex samples are analyzed by GC-MS it is common to observe co-elution of two or more components, resulting in an overlap of signal peaks observed in the total ion chromatogram. In such situations manual signal analysis is often the most reliable means for the extraction of pure component signals; however, a systematic manual analysis over a number of samples is both tedious and prone to error. In the past 30 years a number of computational approaches were proposed to assist in the process of the extraction of pure signals from co-eluting GC-MS components. This includes empirical methods, comparison with library spectra, eigenvalue analysis, regression and others. However, to date no approach has been recognized as best, nor accepted as standard. This situation hampers general GC-MS capabilities, and in particular has implications for the development of robust, high-throughput GC-MS analytical protocols required in metabolic profiling and biomarker discovery. Here we first discuss the nature of GC-MS data, and then review some of the approaches proposed for the extraction of pure signals from co-eluting components. We summarize and classify different approaches to this problem, and examine why so many approaches proposed in the past have failed to live up to their full promise. Finally, we give some thoughts on the future developments in this field, and suggest that the progress in general computing capabilities attained in the past two decades has opened new horizons for tackling this important problem

    Identification of Contractile Vacuole Proteins in Trypanosoma cruzi

    Contractile vacuole complexes are critical components of cell volume regulation and have been shown to have other functional roles in several free-living protists. However, very little is known about the functions of the contractile vacuole complex of the parasite Trypanosoma cruzi, the etiologic agent of Chagas disease, other than a role in osmoregulation. Identification of the protein composition of these organelles is important for understanding their physiological roles. We applied a combined proteomic and bioinformatic approach to identify proteins localized to the contractile vacuole. Proteomic analysis of a T. cruzi fraction enriched for contractile vacuoles and analyzed by one-dimensional gel electrophoresis and LC-MS/MS resulted in the addition of 109 newly detected proteins to the group of expressed proteins of epimastigotes. We also identified different peptides that map to at least 39 members of the dispersed gene family 1 (DGF-1), providing evidence that many members of this family are simultaneously expressed in epimastigotes. Of the proteins present in the fraction we selected several homologues with known localizations in contractile vacuoles of other organisms and others that we expected to be present in these vacuoles on the basis of their potential roles. We determined the localization of each by expression as GFP-fusion proteins or with specific antibodies. Six of these putative proteins (Rab11, Rab32, AP180, ATPase subunit B, VAMP1, and phosphate transporter) predominantly localized to the vacuole bladder. TcSNARE2.1, TcSNARE2.2, and calmodulin localized to the spongiome. Calmodulin was also cytosolic. Our results demonstrate the utility of combining subcellular fractionation, proteomic analysis, and bioinformatic approaches for localization of organellar proteins that are difficult to detect with whole cell methodologies. The CV localization of the proteins investigated revealed potential novel roles of these organelles in phosphate metabolism and provided information on the potential participation of adaptor protein complexes in their biogenesis.

    Deciphering lipid structures based on platform-independent decision rule sets

    We developed decision rule sets for Lipid Data Analyzer (LDA; http://genome.tugraz.at/lda2), enabling automated and reliable annotation of lipid species and their molecular structures in high-throughput data from chromatography-coupled tandem mass spectrometry. Platform independence was proven in various mass spectrometric experiments, comprising low- and high-resolution instruments and several collision energies. We propose that this independence and the capability to identify novel lipid molecular species render current state-of-the-art lipid libraries now obsolete